ABSTRACT

A computerized testing system was installed on an 
experimental basis at the Basic Electricity and Electronics School of 
the Naval Training Center in San Diego. The system consisted of a 
network of IBM Personal Computers running a slightly modified version 
of the commercially available MicroCAT Testing System. It was 
configured to fit transparently into the school's computer-managed 
instruction system. After a few minor adjustments and a few added 
features, the system met its goal of paralleling the paper-and-pencil 
version of the tests with a minimum of change in standard testing 
procedures. Now in place, the system provides a base on which 
diagnostic testing research can begin. Diagnostic testing will be 
implemented using the custom interface included in MicroCAT, which 
allows users to link FORTRAN or Pascal procedures to MicroCAT. 
(Author/JAZ)

INTRODUCTION



Achievement testing takes up a substantial portion of a trainee's time in a 
self-paced military service technical school because continual assessment of the 
trainee's skills is necessary to pace the instruction. Obviously, anything that 
can be done to make testing more efficient or to extract better information 
from the testing process will enhance the quality of training. Several forms of 
computerized testing, including computerized adaptive testing (Weiss, 1982, 
1985) and computer-based diagnostic testing (Tatsuoka & Tatsuoka, 1983), offer
the promise of such an improvement. 

Computer-based instruction and testing in the service schools requires 
reliable, inexpensive computer equipment that can handle a variety of 
presentation forms. Among the forms such equipment must handle are standard 
computer-based instruction and conventional, adaptive, or diagnostic testing. 
Although a variety of software systems for computer-based instruction are 
available, very few software systems are available for implementing adaptive 
or diagnostic testing. The MicroCAT™ Testing System (Assessment Systems,
1984) is a generic testing system that can be used for most forms of testing and 
many forms of computer-based instruction. 

The development of the MicroCAT system was partially supported by funds 
from the Office of Naval Research (ONR). A major objective of ONR in 
supporting this development was to provide a testing system to meet the needs 
of the training and achievement-testing environment. To test its effectiveness 
in this environment, MicroCAT was implemented in a Navy Training Center as 
a means of introducing diagnostic testing into one of the service technical 
schools. 

The system was implemented at the Basic Electricity and Electronics 
(BE&E) School at the Naval Training Center in San Diego, California. The 
overall implementation plan was to introduce a computerized testing system 
into the current testing process and, once this system was in place and tested, to 
extend the program to diagnostic testing. This report describes the design and 
initial implementation of this system. 



DESIGN OF THE TESTING SYSTEM 

When this project began, the design of the MicroCAT Testing System was 
nearly complete and many of the MicroCAT programs had been developed. The 
objectives of the design of the testing system for the BE&E School were: (1) to 
assess the testing needs of the school, (2) to expand the MicroCAT system to 
allow the strategies for diagnostic testing to be implemented, and (3) to 
integrate the system into the testing environment and the computer-managed 
instructional system that were already in place at the school. 

System Requirements 

Students at the BE&E School are tested approximately once a day. The 
student studies a particular subject and then takes a test on that subject. The
achievement tests used in the BE&E curriculum contain 8 to 50 questions on
basic electricity and electronics knowledge. Typically they consist of some 
form of graphic (e.g., a schematic or a chart) and a question, often using special
symbols (such as an omega for ohms). Figure 1 shows a sample item (not 
actually used in the BE&E curriculum) on resistance analysis. To solve this 
problem, the examinee must know how to apply Ohm's law and must either 
recognize that the bridge on the right of the schematic is balanced (thus 
providing a computational shortcut) or apply an appropriate network theorem 
to determine the overall resistance, and thus current, in the system.
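The balanced-bridge shortcut can be illustrated with a short sketch. The circuit of the sample item is not reproduced in this text, so the topology (a series resistor feeding a five-resistor bridge), the function names, and all component values below are hypothetical and chosen only for illustration. In a balanced bridge the two branch midpoints sit at the same potential, so the bridging resistor carries no current and can be ignored when the equivalent resistance is computed.

```python
def parallel(a, b):
    """Equivalent resistance of two resistors in parallel (ohms)."""
    return a * b / (a + b)


def is_balanced(r1, r2, r3, r4, tol=1e-9):
    """True when r1/r2 == r3/r4, tested by cross-multiplication.

    Assumed layout: r1 and r2 form one branch in series, r3 and r4 the
    other, and a fifth resistor bridges the two branch midpoints.
    """
    return abs(r1 * r4 - r2 * r3) <= tol


def balanced_bridge_resistance(r1, r2, r3, r4):
    """Equivalent resistance of a balanced bridge; the bridging resistor drops out."""
    if not is_balanced(r1, r2, r3, r4):
        raise ValueError("bridge is not balanced; apply a network theorem instead")
    return parallel(r1 + r2, r3 + r4)


# Hypothetical values, not taken from the sample item.
supply_volts = 10.0
series_ohms = 175.0
bridge_ohms = balanced_bridge_resistance(100.0, 200.0, 300.0, 600.0)
current_amps = supply_volts / (series_ohms + bridge_ohms)  # Ohm's law: I = V / R
print(f"{current_amps * 1000:.1f} mA")
```

An examinee who does not notice the balance must solve the full five-resistor network, which is why the item discriminates between the two levels of understanding.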



Figure 1. Sample Electronics Item 
How much current will flow in this circuit?

A. 21 milliamperes
B. 63 milliamperes
C. 127 milliamperes
D. 254 milliamperes

To take a test in the conventional paper-and-pencil format, a student
reports to a testing room and is assigned a microfiche card containing the test. 
The student then goes to a testing carrel containing a microfiche reader, loads 
the test into it, and responds to the questions by marking an optically scannable 
answer sheet. After the student completes the test, he or she puts the answer 
sheet into an optical scanner, which reads the answer sheet and transmits the 
information to MIISA, the computer-managed instruction system running on a 
mainframe computer in Memphis, Tennessee. MIISA determines that the test 
the examinee took was the proper one, scores it, reports the results, and updates
the student's record in the database. The student receives the reported results
on a printing computer terminal connected to the optical scanner. This report 
tells the student his or her score and what test to take next. 

During the initial phases of implementation of the computerized testing 
system, students could take tests using either the computerized system or the 
conventional microfiche cards. The computerized tests had to be psychometri- 
cally comparable to the microfiche tests because all scores would be interpreted 
on the same scale. It was important that the tests be psychologically compara- 
ble as well, because if students perceived a difference in the difficulty of the 
tests, either real or imagined, they might avoid the form they considered to be 
more difficult or troublesome. Three factors that contribute to the psychologi- 
cal comparability of the forms are: (1) speed of system response to the 
examinee, (2) fault tolerance during system failures, and (3) support of 
standard test-taking strategies. To avoid giving the examinee the impression 
that the computerized version is slower than the conventional version, a goal 
for the maximum system time between the examinee's response and the 
presentation of the next item was set at less than five seconds. It is also 
important for examinees to feel confident that their work will not be lost 
because of equipment failure. And finally, a major test-taking strategy that 
must be supported is the examinee's ability to skip items and then return to 
them at the end of the test. 
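The skip-and-return strategy can be sketched as a two-pass administration loop. This is a minimal illustration, not the MicroCAT implementation, which the report does not describe. The function names are hypothetical, and the rule that an item skipped a second time is recorded as blank is an assumption made so that the loop terminates.

```python
def administer(item_ids, get_response):
    """Present items in order, then revisit the skipped ones once.

    get_response(item_id) returns the examinee's answer, or None for a skip.
    """
    responses = {}
    skipped = []
    for item in item_ids:
        answer = get_response(item)
        if answer is None:
            skipped.append(item)  # hold for review at the end of the test
        else:
            responses[item] = answer
    for item in skipped:
        answer = get_response(item)
        # Assumption: a second skip is recorded as a blank response.
        responses[item] = answer if answer is not None else ""
    return responses


# Scripted examinee: answers item 1, skips item 2 once, never answers item 3.
script = {1: ["A"], 2: [None, "C"], 3: [None, None]}
result = administer([1, 2, 3], lambda item: script[item].pop(0))
```

Holding the skipped items in presentation order mirrors what an examinee does with a paper answer sheet: work through the test once, then return to the items left blank.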

The computerized testing system also had to fit into the existing computer- 
managed instructional system without requiring any programming on the part 
of the Navy. This essentially meant that no changes in the testing process 
could be made that would be detected by MIISA. 

Finally, the system had to be able to handle special cases. An example of a 
special case in the traditional testing mode would be a mis-scanned answer 
sheet that failed to give credit for all correct responses. Another would be the 
loss of an examinee's record after its receipt had been acknowledged by MIISA. 
In the conventional testing format, special cases are handled by the test proctor, 
who interacts with MIISA on the printer terminals used to return examinee test 
results. A similar means of proctor intervention had to be made available with 
the computerized testing system. 

Analysis of the Systems 

MIISA: The Navy's Computer-Managed Instruction System 

All instruction and testing at the BE&E School is managed by MIISA, the 
program that assigns and scores tests and tracks student progress throughout the 
entire course of study. It is a very large program running on a mainframe 
computer at a central computer installation. Because of its size and distance 
from the BE&E School, it is very difficult to make any changes to the program. 
Therefore, the computerized testing system had to use existing MIISA 
interfaces. The most convenient interface was with the printer terminals 
through which scanned test responses are transmitted and score reports are 
received. 



The printer terminals used are General Electric Terminet terminals. These 
terminals contain sufficient intelligence to read the data from the optical 
scanner, add a header of approximately 20 characters, and transmit the 
transaction to MIISA. Data are transmitted from the Terminet through a 
standard RS232 serial port. The data are communicated through a 1200-baud 
modem to a local concentrator and then transmitted to MIISA at 9600 baud. 

The transactions sent to MIISA are all single lines of ASCII characters 
terminated with a carriage return. Data returned from MIISA are score reports 
formatted for the Terminet's printer. Among the transactions of interest to this 
project are score reports and requests for tests to be taken. It was apparent 
that a convenient way to connect to the existing system was to emulate the 
Terminet terminals, sending proper Terminet transactions and receiving score 
reports. 
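The transaction format described above (one line of ASCII characters, a header of roughly 20 characters, and a terminating carriage return) can be sketched as follows. The report does not give the header contents or the field layout, so the fixed 20-character width, the function names, and the sample header and payload are all assumptions.

```python
HEADER_WIDTH = 20  # assumption: the report says only "approximately 20 characters"
CR = b"\r"


def build_transaction(header, payload):
    """Return one transaction: padded header plus payload, ASCII, ending in CR."""
    if len(header) > HEADER_WIDTH:
        raise ValueError("header longer than the assumed header width")
    line = header.ljust(HEADER_WIDTH) + payload
    if "\r" in line or "\n" in line:
        raise ValueError("a transaction must be a single line")
    return line.encode("ascii") + CR  # raises UnicodeEncodeError on non-ASCII text


def split_transactions(stream):
    """Split a received byte stream into individual transaction lines."""
    return [part.decode("ascii") for part in stream.split(CR) if part]


# Hypothetical header and response string, for illustration only.
message = build_transaction("STATION07", "ABDCA")
```

Because an emulator of this kind produces the same bytes a Terminet would, the mainframe program needs no changes, which was the constraint that motivated the approach.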

The MicroCAT Testing System 

The MicroCAT Testing System was designed to be a self-contained system 
for developing, administering, and analyzing adaptive tests. The system is 
packaged into four subsystems: the Development Subsystem, the Examination
Subsystem, the Assessment Subsystem, and the Management Subsystem. The 
programs available in each of these subsystems are shown in Table 1. 

The Development Subsystem contains programs for entering and editing 
test items consisting of text and graphics and for arranging those items into 
tests using a number of conventional and adaptive testing strategies. Tests are 
specified in MCATL, an authoring language designed especially for specifying 
tests. (This specification may also be accomplished by filling in blanks in
predefined strategy templates.) MCATL is compiled to an intermediate form of
code that can be executed quickly during the testing process. 

The Examination Subsystem administers the tests. The programs in this 
subsystem read the test specification instructions (the intermediate code file 
generated by compiling the test specification), present the test items, accept the 
examinee's responses, score the responses, and report the results in a data file. 

The Assessment Subsystem contains programs for analyzing tests that have 
been administered. One program, ASCAL, estimates item response theory (IRT) 
item parameters. Other programs in this subsystem reformat data for analyses, 
perform conventional item analyses, evaluate characteristics of item pools, and 
perform test validation analyses. 

The Management Subsystem is intended for use with a network of testing 
stations. Some programs in the Management Subsystem allow a proctor to 
monitor testing at a number of testing stations from a single terminal; others 
store individual examinee data in a master data file. 



Table 1. MicroCAT Components 



Development Subsystem 

BANK: Enters and edits text and graphics items 
MAKEFONT: Generates special-purpose character sets 
CREATE: Creates tests using pre-defined templates 
EDIT: Enters and edits MCATL test specifications 
COMPILE: Compiles test specifications 

Examination Subsystem 

TESTONE: Tests one examinee and writes the score to a file 
TESTMANY: Tests examinees repeatedly and writes scores to a file 

Assessment Subsystem 

COLLECT: Collects and formats item response data 
ANALYZE: Performs conventional item and test analyses 
ESTIMATE: Estimates IRT item parameters using ASCAL™
EVALUATE: Pre-evaluates a test's potential using IRT 
VALIDATE: Performs test validation analyses 

Management Subsystem 

RESERVE: Reserves disk space for communication
PROCTOR: Proctors test administration from a command station
RETRIEVE: Retrieves data from the master data file



Two substantial modifications to the MicroCAT system were planned to 
incorporate it into the Naval Training Center (NTC) testing environment. At 
the time this implementation was planned, the MicroCAT system had no facility 
by which an examinee could skip an item and later review it, or change 
responses to any items previously administered. Such capabilities are not 
typically allowed in adaptive testing. However, to maintain psychological 
comparability to the existing conventional testing process, such an addition was 
necessary. The second addition was the incorporation of communication 
facilities so that the MicroCAT Testing System could communicate with MIISA. 
The planned approach to this was to enhance the proctoring program so that it 
could communicate with MIISA by emulating the General Electric Terminet 
terminals and the standard transaction protocols. 
Integrating the Resources



The MicroCAT Testing System runs on IBM Personal Computers. Each 
individual testing station has one such computer. An IBM Personal Computer 
consists of three major components: a monitor, a system unit, and a standard 
keyboard with several additional function keys added at each end. 

For the fault-tolerant testing system required for the NTC implementation, 
18 IBM Personal Computers were connected via an Ethernet local area network.
A diagram of the system is shown in Figure 2. 



Figure 2. Structure of the NTC Implementation 
The two network servers shown at the top of Figure 2 contain the tests that
are administered and the data that are collected. The two servers in the NTC
system are IBM PC-XT computers. Each has 256 KB of RAM, one 360-KB
diskette drive, and one 10-MB hard-disk drive. Each server contains all of
the tests to be administered. In normal operation, each server serves half of the 
testing stations. If either of the servers fails, the other one is capable of 
handling the entire testing system. 
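The load-sharing and failover arrangement can be sketched as a simple selection rule. The report does not describe how a station switches servers, so the rule below (an even/odd split with fallback to whichever server is still running) and the server names are assumptions for illustration only.

```python
SERVERS = ("server_1", "server_2")  # hypothetical names


def primary_server(station):
    """Split stations evenly: even-numbered to server_1, odd-numbered to server_2."""
    return SERVERS[station % 2]


def server_for(station, available):
    """Return the station's primary server if it is up, else the other server."""
    primary = primary_server(station)
    if primary in available:
        return primary
    for server in SERVERS:
        if server in available:
            return server
    raise RuntimeError("no network server is available")


stations = range(16)  # 18 networked PCs less the two servers
normal = [server_for(s, {"server_1", "server_2"}) for s in stations]
degraded = [server_for(s, {"server_2"}) for s in stations]
```

The fallback works only because each server holds a complete copy of the tests; a server that held half the tests could not take over the other half of the stations.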



All remaining terminals on the network are IBM PCs with 192 KB of RAM
and a single diskette drive. The single diskette drive contains only
those programs necessary to link each terminal into the network and the 
current examinee's responses for test recovery in case the testing station fails. 

Two of the testing stations are configured to function as proctoring 
stations. In addition to the standard testing system hardware, they contain a 
serial port to communicate with MIISA and a printer to print test results. In 
normal operation only one proctoring station is used; the other is used as a 
standard testing station and is available as a backup if the proctoring station 
fails. 



IMPLEMENTATION OF THE SYSTEM 

Description of the Initial System 

The initial system was organized functionally as described above and in 
Figure 2. To operate the system, the test proctor first has to start the network 
servers by turning on the power, entering the date and time, and making a 
single keystroke to start the network server in a normal fashion. 

After starting the servers, the proctor turns on the proctor station and all 
of the testing stations. All of these terminals automatically link into the 
network and establish a connection with the proper server. The server to which 
each testing station connects is determined by data contained on the diskette
within the testing station. 

The proctoring station presents a message asking the proctor if the test 
request queue should be cleared. In the case of a normal start, this is always 
done. Only in unusual circumstances, such as recovery after a power failure, 
would the proctor not clear the test request queue. The communications link 
with MIISA is automatically established by turning on the modem. At this 
point the system is ready for operation. 

When an examinee arrives to take a test, the proctor assigns him or her to 
one of the available testing stations. Each available testing station displays the 
message that the examinee should enter his or her Social Security number and 
press the return key. When the examinee does this, the Social Security number 
is passed through the network to the server and from the server to the 
proctoring station, where it is formatted into a transaction asking MIISA what 
test should be assigned to the examinee. MIISA then responds with a report, 
and the program running on the proctoring station extracts the test identifier 
from that report. It passes the test identifier back through the network to the
testing station where the examinee is waiting for a test. This process typically 
takes between 5 and 10 seconds.

As items are administered at the testing station, each response is edited to
ensure that it is valid. When the examinee finishes a test, the response record 
is passed through the network to the proctoring station, which formats it into a 
transaction and transmits it to MIISA. MIISA then scores the test, updates the 
examinee's course record, and transmits a report to the proctoring station. The
proctoring station then passes this report to the system printer, from which the 
examinee obtains his or her score report. The testing process is complete at this 
point. 

Initial Evaluation 

For the most part, the initial system ran without error. Students taking 
tests on the system were reliably tested and always received proper reports 
from MIISA. However, the Navy chiefs in charge of the testing process noticed 
two potential difficulties with the system. First, they determined that it was 
possible for a student to run two terminals simultaneously. By doing this, a 
student could preview the items on one station and then answer them on a 
second station. Since there was no feedback given about the correctness of 
responses, there was really no advantage to be gained from doing this, but it 
was nevertheless of concern to the chiefs. The second potential problem was
that students could reset their testing stations with several combinations of keys 
(e.g., control-c, and the system reset combination of control-alt-delete). 

System Revisions 

To alleviate the first problem identified in the initial evaluation, a lockout 
buffer was incorporated into the proctoring station to prevent a student from 
operating more than one station at a time. When a student logs into the system,
his or her Social Security number is kept in a buffer and is not deleted until he 
or she completes the test. If the student tries to log in at another station, a 
message appears informing him or her that this is not allowed, and the proctor 
is alerted at the proctoring station. 
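The lockout buffer amounts to a table of examinees who have logged in but not yet finished. The following is a minimal sketch under that reading; the class and method names are hypothetical, and the identifier shown is made up.

```python
class LockoutBuffer:
    """Tracks which examinee is active at which station (illustrative names)."""

    def __init__(self):
        self._active = {}  # examinee identifier -> station number

    def log_in(self, examinee_id, station):
        """Admit the examinee unless the identifier is already in the buffer."""
        if examinee_id in self._active:
            return False  # caller shows the refusal message and alerts the proctor
        self._active[examinee_id] = station
        return True

    def complete_test(self, examinee_id):
        """Release the entry once the examinee finishes the test."""
        self._active.pop(examinee_id, None)


buffer = LockoutBuffer()
first = buffer.log_in("000-00-0001", station=3)   # admitted
second = buffer.log_in("000-00-0001", station=5)  # refused: still testing at 3
buffer.complete_test("000-00-0001")
third = buffer.log_in("000-00-0001", station=5)   # admitted after completion
```

Keeping the buffer at the proctoring station gives a single point of control: every login request already passes through that station on its way to MIISA.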

To solve the reset problem, most of the control key combinations that could 
reset the testing station were disabled. However, the control-alt-delete 
combination is buried deep in the hardware of the IBM Personal Computer as 
the system reset and there is no way to disable it from the software. This was 
considered a relatively minor problem, however, because it is extremely 
unlikely that a student would hit this combination of keys accidentally, and 
anyone who was determined to reset the station could always do so by turning 
the power off, even if the control-alt-delete combination could have been
disabled. 

Additional Features 

Several additional features were added to the system, some of which had
not been initially intended. The first was a modification to allow students to
take remedial tests on the computerized testing system. (Remedial tests are for 
students who fail a particular portion of a test and must retake only that 
portion after additional study.) In the microfiche mode, the student simply 
answers the questions in that section and leaves all of the other sections on the 
answer sheet blank. If the student accidentally answers items in any other 
section of the test, the test record is rejected by MIISA. To allow remedial 
examinations in the computerized testing system, two modifications were made. 
First, standard testing mode was altered to allow students in remedial mode to 
skip sections of the test. Remedial test sections without any responses are
simply ignored by MIISA. Another modification was necessary to solve the 
problem that occurred when an examinee accidentally answered an item in the 
wrong section, causing MIISA to reject the entire test record. In this case, the 
response vector is analyzed at the proctoring station, and any section in the test 
that has some but not all of the items answered is completely blanked as if the 
examinee had answered no items in that section. That section is then ignored 
by MIISA, and only the section that has all items answered is scored. With 
these modifications, the computerized mode is virtually identical to the paper- 
and-pencil mode of remedial testing. 
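The blanking rule applied to the response vector can be sketched as follows. The encoding of an unanswered item as a space, the representation of the vector as a list of single characters, and the function name are assumptions; the report specifies only the rule itself.

```python
BLANK = " "  # assumption: an unanswered item is stored as a space


def blank_partial_sections(responses, section_sizes):
    """Blank every section that has some, but not all, items answered."""
    if sum(section_sizes) != len(responses):
        raise ValueError("section sizes do not cover the response vector")
    cleaned = []
    start = 0
    for size in section_sizes:
        section = responses[start:start + size]
        answered = sum(1 for r in section if r != BLANK)
        if 0 < answered < size:
            section = [BLANK] * size  # treat as if no items were answered
        cleaned.extend(section)
        start += size
    return cleaned


# Hypothetical remedial test: three sections of three items each. The examinee
# answered one stray item in section 1 and all of section 2.
vector = list("A  " "BCD" "   ")
result = blank_partial_sections(vector, [3, 3, 3])
```

Under this rule a stray answer no longer causes the whole record to be rejected, but a section the examinee intended to take and left incomplete is also blanked, so the examinee must answer every item in the remedial section.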

A second feature that was added to the system was the capability to 
retransmit an examinee's test record directly from the proctoring station. 
Occasionally, the MIISA system would accept a test record and produce a report 
but then lose the test record. The proctors then had to re-enter the record by 
hand using the communication capability provided in the proctoring station. 
To solve this problem, a facility was incorporated into the proctoring program 
that would retransmit the entire test record from the recovery file on the 
testing station's diskette. 

Because the data collected by computer administration were to be analyzed 
by the University of Illinois, a data transfer scheme was needed. The MIISA 
link is a real-time link in that testing waits for communication. Transferring 
the data to the University of Illinois, on the other hand, had to be done only 
when the data were needed or when the disks on the NTC network were full. 
A system was developed whereby the test proctor periodically dumped the data 
from the system disks to two sets of diskettes, one for the University of Illinois 
and one for backup. After dumping the data, the proctor was instructed to 
mail one set to the University of Illinois and to keep the backup set until 
receipt was confirmed. The data on the system disk were erased after the 
diskettes were made. Except for the difficulty of getting the proctor to make 
the data diskettes on a regular basis, this scheme worked well. 

Testing has not been interrupted because of any system problems. It was
interrupted for several weeks, however, by the implementation of new versions 
of the tests. The frequent changes in tests, which had not been anticipated 
when the system was installed, required frequent communication with the 
University of Illinois. It had been intended that the University of Illinois 
would do the test development and then either manually install the tests in the 
San Diego system or mail complete test files with installation programs to be 
run by the proctor. However, as the test changes became more frequent, it
became apparent that it would be more efficient for NTC personnel to make 
the changes themselves and install the tests. 

Test development in the MicroCAT system is a three-stage process. First, 
the items are authored using the system's Graphics Item Banker. Then the test 
is specified using an authoring language. Finally, the authoring language is 
compiled, a process that reformats the items and processes the instructions in a 
manner that allows items to be presented rapidly. Implementing a test in the 
NTC system required the further step of copying the compiled test onto the 
appropriate disk volume. 
NTC test administration personnel mastered the process with relative ease.
However, a few problems did arise. One problem was that if diskettes were
swapped while the item banker was running, a bank would be destroyed. 
Although this problem is easily circumvented by not swapping diskettes, this 
solution was obviously not optimal. A utility program that could recover a 
bank destroyed in this manner was developed. 

A second problem that was encountered was that two people sharing a disk 
volume using the Ethernet network from 3Com can, under certain 
circumstances, destroy each other's work. For example, NTC personnel 
destroyed an item bank by writing portions of a memo over it. Fortunately, the 
new program was able to restore most of what was lost.



EVALUATION OF THE SYSTEM 

The MicroCAT Testing System was implemented at the BE&E School to 
provide a vehicle for diagnostic testing and to evaluate the MicroCAT system 
in a full-scale operational testing environment. In general, the MicroCAT 
system has performed admirably. To date, approximately 2,400 items have been 
banked for this application. From these, approximately 50 different tests have 
been implemented, and approximately 1,500 tests have been administered. 
Informal evidence from the BE&E School suggests that the system is fast 
enough for all testing needs, that examinees perceive it as psychologically
parallel to the microfiche form of testing, and that it is adequately fault- 
tolerant. Although the local testing system rarely fails, the capability to 
retransmit data if MIISA loses the original transmission has been very valuable. 

As an evaluation site for the MicroCAT system, the NTC environment has 
been less than optimal. To date, only the conventional testing capabilities of 
the MicroCAT Testing System have been evaluated to any degree. The 
considerable power for adaptive test administration and analysis that is a major 
strength of the MicroCAT system has not been evaluated at all in the NTC 
implementation. Fortunately, some of the commercial sites in which the 
MicroCAT Testing System is used have provided more thorough tests of the 
system's adaptive testing capabilities. Even there, however, it may be several 
years before all of the extensive capabilities of the MicroCAT Testing System 
are given a challenging test. 

FUTURE PLANS FOR DIAGNOSTIC TESTING 

The MicroCAT Testing System has not yet been used for diagnostic testing; 
insufficient data have been collected to allow diagnostic tests to be developed. 
The programs are ready to implement such testing, however. 

MicroCAT does not include the diagnostic testing strategies because they 
are still under development and are not widely used. Diagnostic testing will be 
implemented using the custom interface included in MicroCAT. The custom 
interface allows users to link FORTRAN or Pascal procedures to MicroCAT. 
New scoring procedures can be included this way and are treated by MicroCAT 
in a manner similar to the standard scoring procedures (i.e., they are executed 
each time a score is needed). Similarly, test execution can jump directly to a
custom procedure through the execution of a procedure call in the test 
specification. 

Using these custom interfaces, programmers at the University of Illinois 
will develop and revise the diagnostic procedures as needed. No modification
to the MicroCAT Testing System itself will be required. 